On Retaining Intermediate Probabilistic Models When Building Bayesian Networks
Abstract
Building a Bayesian network may occur in stages: intermediate Bayesian networks are built during preliminary processing and then used to construct further Bayesian networks. For example, in (Doshi, Greenwald, & Clarke 2001) we describe a way to use Bayesian networks to model and correct errors in noisy datasets. The corrected datasets are then used in (Doshi 2001) to build predictive Bayesian networks. Through this process we built networks that capture probabilistic relationships between 412 fields of data from 169,512 patients admitted to trauma centers in Pennsylvania and registered in the Pennsylvania Trauma Systems Foundation Trauma Registry between 1986 and 1999.
In that process, intermediate Bayesian networks were used to find the most likely values for fields found to have errors, and these most likely values were then used in the cleansed dataset. However, while subsequently building Bayesian networks from this dataset, we questioned whether the intermediate networks used in error correction should have been retained. In other words, we wanted to understand the tradeoffs involved in retaining the distributional information summarized in each error-correction network rather than retaining only the most likely value for each corrected field. This question generalizes to any process of building a Bayesian network in stages. This note describes preliminary work to understand these issues.
An important component of this staged network-building process is that common variables are represented from one stage to the next. In data cleansing, the variables queried for error distributions are the same variables that serve as evidence variables in the final predictive network. Furthermore, the context variables used to model errors are also represented directly in the final network.
Retaining distribution information can be accomplished by employing networks from early stages within the subsequent networks. Common variables limit the potential blow-up in network size.
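The tradeoff described above can be illustrated with a minimal sketch. It is not the method used in the cited work: the binary field, the outcome variable, and all probabilities below are hypothetical values chosen only for illustration. The sketch compares a prediction that uses only the most likely corrected value with one that keeps the full distribution produced by the error-correction stage and marginalizes over it.

```python
# Hypothetical example: all names and numbers are assumptions, not taken
# from the trauma registry or from the cited networks.

def most_likely_value(posterior):
    """Return the value with the highest posterior probability."""
    return max(posterior, key=posterior.get)

def predict_hard(posterior, p_outcome_given_field):
    """Predict using only the most likely corrected value of the field."""
    return p_outcome_given_field[most_likely_value(posterior)]

def predict_soft(posterior, p_outcome_given_field):
    """Predict by marginalizing over the retained distribution of the field."""
    return sum(p * p_outcome_given_field[x] for x, p in posterior.items())

# Assumed output of an error-correction network for one record:
# P(field = x | context variables).
posterior_field = {0: 0.4, 1: 0.6}

# Assumed conditional table in the predictive network: P(outcome = 1 | field = x).
p_outcome_given_field = {0: 0.1, 1: 0.8}

hard = predict_hard(posterior_field, p_outcome_given_field)
soft = predict_soft(posterior_field, p_outcome_given_field)

print(f"most likely value only: {hard:.2f}")
print(f"retained distribution:  {soft:.2f}")
```

In this toy case the two predictions differ noticeably because the corrected field is far from certain. When the error-correction posterior is close to 0 or 1, the two approaches give nearly the same answer, which is one way to think about when retaining the intermediate network may matter.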
Similar resources
Rule-based joint fuzzy and probabilistic networks
One of the important challenges in Graphical models is the problem of dealing with the uncertainties in the problem. Among graphical networks, fuzzy cognitive map is only capable of modeling fuzzy uncertainty and the Bayesian network is only capable of modeling probabilistic uncertainty. In many real issues, we are faced with both fuzzy and probabilistic uncertainties. In these cases, the propo...
Load-Frequency Control: a GA based Bayesian Networks Multi-agent System
Bayesian Networks (BN) provide a robust probabilistic method of reasoning under uncertainty. They have been successfully applied in a variety of real-world tasks but they have received little attention in the area of load-frequency control (LFC). In practice, LFC systems use proportional-integral controllers. However since these controllers are designed using a linear model, the nonlinearities...
A Hybrid Bayesian Network Modeling Environment
Bayesian networks are a powerful method for building probability models. But the formalism does not support incremental model development and reuse of models. This is partly due to the fact that Bayesian networks require precise probability values, while incremental model development and model reuse require the ability to abstract probability information. We present a formalism called hybrid Ba...
Probabilistic Contaminant Source Identification in Water Distribution Infrastructure Systems
Large water distribution systems can be highly vulnerable to penetration of contaminant factors caused by different means including deliberate contamination injections. As contaminants quickly spread into a water distribution network, rapid characterization of the pollution source has a high measure of importance for early warning assessment and disaster management. In this paper, a methodology...
From Probabilistic Horn Logic to Chain Logic
Probabilistic logics have attracted a great deal of attention during the past few years. Where logical languages have, already from the inception of the field of artificial intelligence, taken a central position in research on knowledge representation and automated reasoning, probabilistic graphical models with their associated probabilistic basis have taken up in recent years a similar positio...